Feature Extraction & Image Processing,
M. S. Nixon and A. S. Aguado
Elsevier / Butterworth-Heinemann, 2nd Edition, 2007
This worksheet is the companion to Chapter 3 and implements the basic image processing operations described therein. The worksheet follows the text directly and allows you to process the eye image.
This Chapter concerns basic image operations, essentially those which alter a pixel's value in a chosen way. We might want to make an image brighter (if it is too dark), or to remove contamination by noise. For these, we would need to make the pixel values larger (in some controlled way) or to change the pixel's value if we suspect it to be wrong, respectively. Let's start with images of pixels, by reading in the image of a human eye.
This gives us 4096 pixels. Each pixel is an eight-bit byte (n.b. it's stored in .BMP format) so this gives us 256 possible intensity levels, starting at zero and ending at 255. It's more common to use larger (say 256x256) images, but you won't be tempted to use much larger ones in Mathcad. It's very common to use 8 bits for pixels, as this is well suited to digitised video information.
We describe the occupation of intensity levels by a histogram. This is a count of all pixels with a specified brightness level, plotted against brightness level. As a function, we can calculate it by:
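The worksheet's Mathcad histogram function can be sketched in Python (a stand-in for the Mathcad code, not a transcription of it; `numpy` and the tiny test array are assumptions for illustration):

```python
import numpy as np

def histogram(pic, levels=256):
    """Count the pixels at each brightness level (0..levels-1)."""
    hist = np.zeros(levels, dtype=int)
    for value in pic.ravel():
        hist[value] += 1
    return hist

# A tiny 2x2 stand-in image (not the eye image)
p = np.array([[0, 255], [255, 128]], dtype=np.uint8)
h = histogram(p)
```

Plotting `h` against brightness level gives the histogram described in the text.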
So here's the histogram of our picture of the eye image, p. The bright pixels relate mainly to the skin, the darker ones to the hair.
The most common point operator replaces each pixel by a scaled version of the original value. We therefore multiply each pixel by a number (like a gain), by specifying a function scale which is fed the picture and the gain, or a level shift (upwards or downwards). The function scale takes a picture pic and multiplies it by gain and adds a level
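A minimal Python sketch of the scale function (the clipping to the 8-bit range is an assumption; the worksheet's Mathcad version may handle overflow differently):

```python
import numpy as np

def scale(pic, gain, level):
    """Multiply each pixel by gain and add level, clipped to the 8-bit range."""
    out = pic.astype(float) * gain + level
    return np.clip(out, 0, 255).astype(np.uint8)

p = np.array([[10, 100], [200, 250]], dtype=np.uint8)
brighter = scale(p, 1.2, 10)
```

When the gain is too big, bright pixels saturate at 255, which is why detail is lost in the highlights.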
You can change the settings of the parameters to see their effect, that's why you've got this electronic document. Try making it brighter and darker. What happens when the gain is too big (>1.23)?
The difference is clear in the magnitude of the pixels: those in the 'brighter' image are much larger than those in the original, as is also apparent by comparing the processed image with the original. The difference between the images is much clearer when we look at the histogram of the brighter image.
This is what we expect: the histogram has simply been moved along the brightness axis (it now starts well after 100), and it reveals some detail which was obscured earlier.
Generally, for point operators we generate a function which is used as a look up table to find the new value of a point. Pure scaling is a look up table whose graph is a straight line with offset set by the level. The slope of this line can be: i) positive and >1 for magnification;
ii) positive and <1 for reduction;
and iii) negative ( + constant) for inversion.
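The three slope cases can be written as look-up tables, sketched here in Python (the tables and test array are illustrative assumptions, not the worksheet's code):

```python
import numpy as np

levels = np.arange(256)
magnify = np.clip(2.0 * levels, 0, 255)   # i) slope > 1: magnification
reduce_ = np.clip(0.5 * levels, 0, 255)   # ii) slope < 1: reduction
invert = 255 - levels                     # iii) negative slope + constant: inversion

p = np.array([[0, 100], [200, 255]], dtype=np.uint8)
inverted = invert[p]   # apply a look-up table by array indexing
```

Indexing the table by the picture applies the point operator to every pixel at once.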
Try these out!
We might also want to use a more specialised form of look up table, say the saw-tooth operator. For this, we split the brightness range up into bands, and use a linear look-up table in each.
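One way to sketch the saw-tooth operator in Python (the band width and the ramp scaling are assumptions; the book's Mathcad version may differ in detail):

```python
import numpy as np

def sawtooth(pic, band=64):
    """A linear brightness ramp repeated in each band of the given width."""
    return ((pic.astype(int) % band) * 255 // (band - 1)).astype(np.uint8)

p = np.array([[0, 63], [64, 127]], dtype=np.uint8)
s = sawtooth(p)
```

Each band of 64 input levels maps to the full 0..255 output range, so the look-up table looks like a row of teeth.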
Note that we had to use our earlier scale function here so that we can see the result. (Click on the picture to reveal what's actually displayed.)
A common use of point functions is to equalise the intensity response of a camera. We work out the histogram of the camera response; this gives a function which makes the combined response of the function and the camera constant, giving a uniform intensity response. Let us suppose that the known performance of the camera is exponential. The equalising function is then logarithmic, since log(exp(q)) = q. So let's see what it's like:
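The compensation can be checked numerically; this small Python sketch assumes the idealised exponential camera described above:

```python
import math

def camera(q):
    """Hypothetical camera with exponential response."""
    return math.exp(q)

def equalise_response(v):
    """Logarithmic look-up that compensates the exponential camera."""
    return math.log(v)

q = 0.75
restored = equalise_response(camera(q))   # should recover q
```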
Now we can't see anything! This is because there are only two brightness levels in the image (it wasn't acquired by a camera with exponential performance). In order to show more clearly what happens to images, we need to be able to manipulate their histograms. Intensity normalisation stretches a picture's histogram so that all available brightness levels are used. Having shifted the origin to 0, by subtracting the minimum brightness, we then scale up the brightness, by multiplying by some fraction of the full range. It's also called histogram normalisation. Given 8-bit pixels, with 256 brightness levels (0..255), our function is:
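A Python sketch of intensity normalisation, following the two steps just described (shift the origin, then scale up); the test array is an assumption:

```python
import numpy as np

def normalise(pic, levels=256):
    """Stretch the histogram so the full range 0..levels-1 is used."""
    lo = float(pic.min())
    hi = float(pic.max())
    return ((pic.astype(float) - lo) * (levels - 1) / (hi - lo)).astype(np.uint8)

p = np.array([[100, 150], [125, 200]], dtype=np.uint8)
n = normalise(p)
```

The darkest pixel maps to 0 and the brightest to 255, whatever range the original occupied.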
Histogram equalisation is a nonlinear histogram-stretching process. The equalised histogram is a resampled cumulative histogram. We first work out the histogram of the picture, then we work out the cumulative histogram. Finally, we resample the cumulative histogram, giving a look up table to map original intensity levels to the equalised ones.
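The three steps above (histogram, cumulative histogram, resampling into a look-up table) can be sketched in Python; the tiny test image is an assumption:

```python
import numpy as np

def equalise_hist(pic, levels=256):
    """Map intensities through the scaled cumulative histogram."""
    hist = np.bincount(pic.ravel(), minlength=levels)
    cum = np.cumsum(hist)                              # cumulative histogram
    lut = (cum * (levels - 1) // cum[-1]).astype(np.uint8)
    return lut[pic]                                    # resample via look-up table

p = np.array([[0, 0], [0, 255]], dtype=np.uint8)
e = equalise_hist(p)
```

Because the mapping follows the cumulative histogram, heavily populated brightness levels are spread out while sparse ones are compressed, which is the nonlinearity the text describes.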
The main difference between equalisation and normalisation is that in normalisation all grey levels have the same 'weight': the process stretches the histogram to occupy the available range. In equalisation, the histogram is resampled or manipulated, again to cover the available range. Since the histogram is manipulated, brightness values do not have the same weight.
This is actually how Mathcad displays images when you display them using the surface plot facility (which is why we're using the picture facility instead; it's faster too!). Now try equalising the brighter image (as defined earlier) - do you expect the result you get?
If we want to find pixels with brightness above a specified level, we use thresholding. The operator is:
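A Python sketch of the thresholding operator (the choice of 255 for 'above threshold' and the test array are assumptions):

```python
import numpy as np

def threshold(pic, level):
    """White (255) where the pixel exceeds level, black (0) elsewhere."""
    return np.where(pic > level, 255, 0).astype(np.uint8)

p = np.array([[10, 120], [160, 250]], dtype=np.uint8)
t = threshold(p, 128)
```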
We'll now move on to group operators, where the new pixel values are the result of analysing points in a region, rather than point operators which operate on single points. First, we have a template which describes the region of interest, and we then convolve this by summing up, over the region of the template, the result of multiplying pixel values by the respective template (weighting) coefficient. We can't process the borders since part of the template falls outside the picture. Accordingly, we need an operator which sets an image to black, so that the borders in the final image are set to black. Black is zero brightness, so the operator which sets a whole image to black is:
A 3*3 averaging (mean) operator sums the points in the 3*3 neighbourhood centred on the point of interest. This means we can't process a 1 pixel wide picture border so the operator first sets the whole output picture to black and then replaces points by the average of their 3*3 neighbourhood. A direct implementation of this process is
This is equivalent to convolving a template which has all elements set to unity and then dividing the result by the sum of the template coefficients. A general form of an averaging template operator (which can accept any template size) is
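The general averaging operator can be sketched in Python, combining both steps above: set the output to black, then replace interior points by the neighbourhood average (a stand-in for the Mathcad code, with a uniform test image as an assumption):

```python
import numpy as np

def ave(pic, winsize):
    """Average over a winsize x winsize neighbourhood; border left black."""
    out = np.zeros_like(pic)            # zero operator: whole picture black
    half = winsize // 2
    rows, cols = pic.shape
    for y in range(half, rows - half):
        for x in range(half, cols - half):
            region = pic[y - half:y + half + 1, x - half:x + half + 1]
            out[y, x] = region.mean()   # sum / number of template coefficients
    return out

p = np.full((5, 5), 90, dtype=np.uint8)
a = ave(p, 3)
```

A uniform image stays uniform in the interior, while the unprocessed border is black, as described.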
Note the blurring in the result, as well as the increased uniformity of the background; this is equivalent to reducing the background noise. Try other (odd) numbers for the size, say 5, 7 or 9. Do you expect the observed effects? There is a mean operator in Mathcad which we shall use for future averaging operations, as:
with the same result. An alternative is to use the centre smoothing operation in Mathcad, put centsmooth in place of mean. To use the template convolution operator, tmconv, we need to define an averaging template:
Since there is a duality between convolution in the spatial domain and multiplication in the frequency domain, we can implement template convolution using the Fourier transform. Template convolution is the inverse Fourier transform of the product of the Fourier transform of the image with the transform of the template. First we need a picture of the template; this picture must be the same size as the image we want to convolve it with. For averaging, we need a 3*3 square in the centre of an image of the same size as the eye image:
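The frequency-domain route can be sketched in Python with `numpy`'s FFT (the template placement here is an assumption; the worksheet centres the template, which introduces a shift that it then corrects):

```python
import numpy as np

def fft_convolve(pic, template_pic):
    """Template convolution via the convolution theorem (circular)."""
    result = np.fft.ifft2(np.fft.fft2(pic) * np.fft.fft2(template_pic))
    return np.real(result)

# 8x8 image with a single bright point, and a 3x3 averaging template
# embedded in a picture of the same size as the image
pic = np.zeros((8, 8))
pic[4, 4] = 9.0
template = np.zeros((8, 8))
template[:3, :3] = 1.0 / 9.0
smoothed = fft_convolve(pic, template)
```

The bright point is spread into a 3x3 patch of averaged values; note the result is shifted by the template's position within its picture, which is why centring the template matters in the worksheet.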
This shows that the difference is in the borders; the small differences in pixel values are due to numerical precision.
In image processing, we often use a Gaussian smoothing filter which can give a better smoothing performance than direct averaging. Here the template coefficients are set according to the Gaussian distribution, which for a 2-dimensional distribution controlled by a variance s² is, for a template size defined by winsize:
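A Python sketch of the Gaussian template (normalising the coefficients to sum to 1 is an assumption consistent with averaging; the worksheet's normalisation may be expressed differently):

```python
import numpy as np

def gaussian_template(winsize, sigma):
    """Gaussian-weighted template, normalised so the coefficients sum to 1."""
    centre = winsize // 2
    y, x = np.mgrid[0:winsize, 0:winsize]
    g = np.exp(-((x - centre) ** 2 + (y - centre) ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

t = gaussian_template(19, 4)   # the 19x19, sigma = 4 case from the text
```

The peak is at the centre and the weights fall off symmetrically, giving the bell shape.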
This gives the famous bell-shaped function shown here for a 19*19 window with standard deviation of 4. Try changing the standard deviation from 4 to, say, 2 and 8 so you can see its effect on the width.
This can keep much more detail concerning image features; note here its ability to retain detail in the eye region which was lost in the earlier direct averaging. Again, it can be implemented in the frequency domain, as can any template convolution process.
The mean and Gaussian operators are actually statistical operators since they provide estimates of the mean. There are other statistics; let's go for a median operator. This gives the midpoint of a sorted list, where the list is derived from the pixel values within a specified area. We need to provide the sort function with a vector, so for a 3*3 neighbourhood centred on a point with co-ordinates x,y, we get
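A Python sketch of the 3*3 median: sort the nine neighbourhood values into a vector and take the midpoint (the test image, a flat patch with one bright point, is an assumption):

```python
import numpy as np

def median_3x3(pic):
    """Replace each interior pixel by the median of its 3x3 neighbourhood."""
    out = np.zeros_like(pic)            # border left black, as before
    rows, cols = pic.shape
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            window = np.sort(pic[y - 1:y + 2, x - 1:x + 2].ravel())
            out[y, x] = window[4]       # midpoint of the 9 sorted values
    return out

p = np.full((3, 3), 50, dtype=np.uint8)
p[1, 1] = 255                           # a single 'salt' point
m = median_3x3(p)
```

The outlier ends up at the far end of the sorted list and so never reaches the midpoint, which is exactly why the median rejects salt-and-pepper noise.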
The main function of the median operator is to remove noise (especially salt and pepper noise) whilst retaining the edges of features in an image. You can't see that here, there is little image noise. So let's add some in:
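A Python sketch of a salt-and-pepper noise generator in the spirit of addcondiments (the half-and-half split between salt and pepper, and the seeded generator, are assumptions, not the worksheet's code):

```python
import numpy as np

def addcondiments(pic, fraction, seed=None):
    """Flip roughly the given fraction of pixels to salt (255) or pepper (0)."""
    rng = np.random.default_rng(seed)
    out = pic.copy()
    noise = rng.random(pic.shape)
    out[noise < fraction / 2] = 0            # pepper
    out[noise > 1 - fraction / 2] = 255      # salt
    return out

p = np.full((32, 32), 128, dtype=np.uint8)
noisy_p = addcondiments(p, 0.1, seed=1)
```

Running it without a seed gives a different pattern of noise each time, as the text describes.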
If you make the value supplied as an argument to addcondiments smaller, you'll get less noise, larger values (0.3 say) result in greater noise contamination.
10/10 for the label of this image! Now we have introduced light (salt) and dark (pepper) points into the image. This type of noise is quite common in satellite images, where it arises from transmission decoding errors.
So let's see what our median operator can do with this image, in comparison with direct and Gaussian averaging:
The median operator clearly has a better ability to remove this type of noise. This is because the noise is removed when it is placed at either end of the rank-sorted list, whereas in direct and Gaussian averaging, the salt and pepper contributes to the final result. To run it again, with a different selection of noise points, just select the function noisy_p := addcondiments() and run it again. Each time, you'll get a different pattern of noise. Check the filtering still works.
There is of course a median operator in Mathcad, but I thought I'd show you how it worked. Median filters can be implemented in an adaptive manner, or using non-square templates (e.g. cross or line, usually for computational reasons). We can get a Mathcad median by:
Finally, the last statistic is the mode. This is the peak of the probability distribution (the value most likely to occur). One way to estimate its value is to use the truncated median filter. It operates by taking the median of the distribution resulting from truncating the distribution within a window at a point beyond the mean; here's the code:
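The truncation step for one window can be sketched in Python (this is one common formulation of the truncated median, which may differ in detail from the worksheet's code; the sample window is an assumption):

```python
import numpy as np

def truncated_median(values):
    """Estimate the mode: truncate the tail away from it, then take the median."""
    vals = np.sort(np.asarray(values, dtype=float))
    med = np.median(vals)
    mean = vals.mean()
    if med < mean:        # long upper tail, so the mode lies below the median
        vals = vals[vals <= 2 * med - vals.min()]
    elif med > mean:      # long lower tail, so the mode lies above the median
        vals = vals[vals >= 2 * med - vals.max()]
    return np.median(vals)

# A skewed window: the mode is 2, but the median is 3 and the mean is larger
window = [1, 2, 2, 2, 3, 9, 10, 11, 12]
mode_est = truncated_median(window)
```

Truncating the long tail pulls the median of what remains towards the peak of the distribution, which is the mode estimate.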
If you're on an old machine, go and make a cup of tea, and ring some friends up. It should be finished after that!
As ever, you pay for what you get. In application, is the complexity and speed of the more sophisticated approaches actually warranted?
The next operator we'll consider is anisotropic diffusion, which smoothes images whilst retaining the borders (edges). This invokes concepts of edge detection, which are actually the topic of the next Chapter. Retaining edges is achieved by using a function which is 1 when the edge information (the difference in brightness) is small, and zero when the difference is large. The function is
Clearly, the value of k controls the amount of retention, with large values retaining more. This is then invoked in four compass directions
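A Python sketch of the process, using the common exponential form g(d) = exp(-(d/k)²) for the edge-preserving function and wrap-around at the borders (both assumptions; the worksheet's code may treat the borders and the function differently):

```python
import numpy as np

def anisotropic_diffusion(pic, iterations=10, k=15.0, l=0.25):
    """Iterative smoothing in which flow across strong edges is damped."""
    img = pic.astype(float)
    for _ in range(iterations):
        # brightness differences in the four compass directions
        north = np.roll(img, -1, axis=0) - img
        south = np.roll(img, 1, axis=0) - img
        east = np.roll(img, -1, axis=1) - img
        west = np.roll(img, 1, axis=1) - img
        total = 0.0
        for d in (north, south, east, west):
            g = np.exp(-(d / k) ** 2)   # near 1 for small differences, near 0 at edges
            total = total + g * d
        img = img + l * total           # l controls the amount of smoothing
    return img

p = np.full((8, 8), 100.0)
p[3, 3] = 120.0                         # a small bump to be smoothed away
smoothed = anisotropic_diffusion(p, iterations=5)
```

With larger k, g stays near 1 even at big differences and the result tends to ordinary smoothing; with small k, strong edges are preserved.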
That is smart! Smoothing with edge preservation, unlike the Gaussian. Try varying k, which controls the amount of edge preservation, and l, which controls smoothing. The effect can be assessed by looking at the difference from the original image, so that's given below.
The last operator we shall consider is the force field transform. This uses the gravitational equation, applied as an image processing operator. It is applied by convolution, via the Fourier transform.
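A direct-summation sketch of the force field idea in Python: every pixel attracts every point with a gravitational-style force P(r')(r' - r)/|r' - r|³ (the worksheet applies this by convolution via the Fourier transform, which is much faster; this direct form and the single-point test image are assumptions for illustration):

```python
import numpy as np

def force_field(pic):
    """Magnitude of the gravitational-style force at each pixel (direct form)."""
    rows, cols = pic.shape
    fy = np.zeros((rows, cols))
    fx = np.zeros((rows, cols))
    ys, xs = np.mgrid[0:rows, 0:cols]
    for y in range(rows):
        for x in range(cols):
            dy = ys - y
            dx = xs - x
            dist3 = (dy ** 2 + dx ** 2) ** 1.5
            dist3[y, x] = np.inf        # no self-force
            fy[y, x] = np.sum(pic * dy / dist3)
            fx[y, x] = np.sum(pic * dx / dist3)
    return np.hypot(fy, fx)

p = np.zeros((9, 9))
p[4, 4] = 100.0                         # a single bright 'mass'
f = force_field(p)
```

The force falls off with the square of distance from the bright point, which is the inverse-square behaviour of the gravitational analogy.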
This also has a filtering effect, and emphasises the low-level features too. We developed it first as a pre-processor in ear biometrics, but it could have other uses.
This completes our study of low-level image operators. Why not follow some of the suggestions below to extend or improve your knowledge?
Suggestions: i) investigate the effects of different window sizes;
ii) try out different values for the control parameters;
iii) try adding your own noise to the original image, and see the effects;
iv) try different template shapes for averaging and median operators;
and v) try different images.
Now we'll move on to finding objects in images. So far, we've modified brightness so as to control an object's appearance in an image. But a change in brightness signifies an object's perimeter or border, so this can be used as a first step in finding the object. This is a basic feature, the subject of Chapter 4, Basic Feature Extraction.